DOCUMENT RESUME

ED 239 557                                        HE 016 981

AUTHOR          Friedman, Miriam; And Others
TITLE           Validating Assessment Techniques in an
                Outcome-Centered Liberal Arts Curriculum: Valuing and
                Communications Generic Instruments. Final Report,
                Research Report Number One.
INSTITUTION     Alverno Coll., Milwaukee, Wis.
SPONS AGENCY    National Inst. of Education (ED), Washington, DC.
PUB DATE        80
GRANT           NIE-G-77-0058
NOTE            62p.; Paper presented at the Annual Meeting of the
                American Educational Research Association (Boston,
                MA, April 1980). For related documents, see HE 016
                980-990.
PUB TYPE        Reports - Research/Technical (143) --
                Speeches/Conference Papers (150)
EDRS PRICE      MF01/PC03 Plus Postage.
DESCRIPTORS     College Curriculum; College Students; Communication
                Skills; Competence; *Competency Based Education;
                Evaluation Methods; Higher Education; *Language
                Skills; *Liberal Arts; Moral Values; Outcomes of
                Education; *Research Methodology; *Student
                Evaluation; Test Validity; *Value Judgment
IDENTIFIERS     *Alverno College WI

ABSTRACT
     The methodology for validating assessment techniques in
a performance-based liberal arts curriculum was studied at Alverno
College. Two generic instruments for assessing the competencies of
"communications" and "valuing" were employed. A generic instrument is
defined as one that assesses a competence level across content areas
instead of using a large variety of instruments. The assessment
system at the college requires students to demonstrate incremental
gains while progressing through six sequential levels in each
competency area. Twenty students were assessed with the generic
communications instrument after 2 years of college; another 20 were
assessed upon college entrance. Attention was focused on abilities in
speaking, writing, listening, and reading, as well as
self-assessments of performance in each mode. Eleven students were
assessed with the generic valuing instrument after 2 years of
college, while 20 were assessed upon college entrance. Value and
moral judgments and decision-making were evaluated using written,
oral, and group decision-making modes. Attention was focused on the
validity of the assessment technique, along with the validity of the
definition of competence. (SW)

***********************************************************************
*   Reproductions supplied by EDRS are the best that can be made     *
*                   from the original document.                      *
***********************************************************************






VALIDATING ASSESSMENT TECHNIQUES IN AN OUTCOME-CENTERED 

LIBERAL ARTS CURRICULUM: 
VALUING AND COMMUNICATIONS GENERIC INSTRUMENTS 



Miriam Friedman Marcia Mentkowski 
Margaret Earley Georgine Loacker Mary Diez 



Office of Research & Evaluation/Valuing Division/Communications Division 

ALVERNO COLLEGE 



FINAL REPORT TO THE NATIONAL INSTITUTE OF EDUCATION: 
RESEARCH REPORT NUMBER ONE 



U.S. DEPARTMENT OF EDUCATION 

NATIONAL INSTITUTE OF EDUCATION 
EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as
received from the person or organization
originating it.

Minor changes have been made to improve
reproduction quality.

• Points of view or opinions stated in this document
do not necessarily represent official NIE
position or policy.



Funded by a grant from the National Institute of Education:
Careering After College: Establishing the Validity of Abilities
Learned in College for Later Success
(NIE-G-77-0058)

Principal Investigators:

Marcia Mentkowski

Austin Doherty 

Alverno College 

3401 South 39th Street 

Milwaukee, Wisconsin 53215 






An overview and rationale for our approach to the study of college outcomes, and a summary
of the results from the following series of ten research reports, are found in:

Marcia Mentkowski and Austin Doherty. Careering After College: Establishing the Validity of
Abilities Learned in College for Later Careering and Professional Performance. Final Report
to the National Institute of Education: Overview and Summary. Milwaukee, WI: Alverno
Productions, 1983.

Research Reports: 

One: Friedman, M., Mentkowski, M., Earley, M., Loacker, G., & Diez, M. Validating
Assessment Techniques in an Outcome-Centered Liberal Arts Curriculum:
Valuing and Communications Generic Instruments, 1980.

Two: Friedman, M., Mentkowski, M., Deutsch, B., Shovar, M.N., & Allen, Z. Validating
Assessment Techniques in an Outcome-Centered Liberal Arts Curriculum: Social
Interaction Generic Instrument, 1982.

Three: Assessment Committee/Office of Research and Evaluation. Validating Assessment
Techniques in an Outcome-Centered Liberal Arts Curriculum: Insights From the
Evaluation and Revision Process, 1980.

Four: Assessment Committee/Office of Research and Evaluation. Validating Assessment
Techniques in an Outcome-Centered Liberal Arts Curriculum: Integrated Competence
Seminar, 1982.

Five: Assessment Committee/Office of Research and Evaluation. Validating Assessment
Techniques in an Outcome-Centered Liberal Arts Curriculum: Six Performance
Characteristics Rating, 1983.

Six: Mentkowski, M., & Strait, M. A Longitudinal Study of Student Change in Cognitive
Development and Generic Abilities in an Outcome-Centered Liberal Arts
Curriculum, 1983.

Seven: Much, N., & Mentkowski, M. Student Perspectives on Liberal Learning at Alverno
College: Justifying Learning as Relevant to Performance in Personal and
Professional Roles, 1982.

Eight: Mentkowski, M., Much, N., & Giencke-Holl, L. Careering After College: Perspectives
on Lifelong Learning and Career Development, 1983.

Nine: Mentkowski, M., DeBack, V., Bishop, J., Allen, Z., & Blanton, B. Developing a
Professional Competence Model for Nursing Education, 1980.

Ten: Mentkowski, M., O'Brien, K., McEachern, W., & Fowler, D. Developing a Professional
Competence Model for Management Education, 1982.



© Copyright 1980. Alverno College Productions, Milwaukee, Wisconsin. All rights reserved under U.S., International
and Universal Copyright Conventions. Reproduction in part or whole by any method is prohibited by law.



ABSTRACT 



Two studies test methodology for validating assessment techniques in a
performance-based liberal arts curriculum. Alverno College has a
system-wide performance-based curriculum, with an assessment process
that requires students to demonstrate incremental gains while progressing
through six sequential levels in each of eight competences. The
competences are integrated with the content in each discipline.
Students are required to attain each level in sequence to
demonstrate cumulative achievement. The two studies assess the
effects of instruction on patterns of student response using instruments
created to ensure cross-college credentialing on the same instruments.
Both instruments are "generic," that is, general criteria are integrated
with criteria specific to the way the ability appears in the discipline
in which the instrument is used. Studies of two generic instruments,
assessing level 4 of the competences of Communications and Valuing,
are reported here.

Twenty students performed on the generic Communications instrument after
two years in college; another twenty performed upon entrance to college.
They demonstrated abilities in four modes of communication: speaking,
writing, listening and reading, providing data on student performance
across different modes of the same competence. The student is also
asked to self-assess her performance in each mode on the same criteria
on which she is judged by the assessor(s). Eleven students performed
on the generic Valuing instrument after two years in college; another
twenty performed upon entrance to college. Students demonstrated
value and moral judgments and decision-making through written, oral
and group decision-making modes. Students also self-assess their
performance.

In the Valuing study, the instruction group performed significantly
better than the no instruction group. Data from the instruction
group provided support for the validity of the cumulative hierarchical
nature of the competence. The no instruction group did not show
any consistent cumulative or sequential patterns. Overall, the
instruction group demonstrated clusters of relationships among scores
on the criteria and the no instruction group appeared to perform in a
randomly scattered manner, indicating effectiveness of instruction.
In the Communications study, students with no instruction demonstrated
a wider range of variability in performance as compared to the
instruction group, who showed a less dispersed pattern. Student
performance varies with the mode of communication. The instruction
group performed significantly better, particularly on the upper levels
of the four communication modes. The different patterns of the
interrelationships of student performance across the four modes are seen
in relation to the levels. Students who had instruction can better
self-assess their performance.

The study methodology reflects our current pattern analysis approach 
rather than using score analysis, correlational analysis or an item
analysis approach alone. The interpretation of the results and the 
methodology developed have implications for similar programs which 
are seeking out new methods to establish construct as well as content 
validity of complex assessment techniques used in performance-based 
curricula in higher education. 



ACKNOWLEDGEMENTS 



We acknowledge the following faculty who assisted in designing the study, collecting 

the data, scoring the instruments or interpreting the results: 

Patricia Burns Everett Kisinger

Mary Diez Gertrude Kramer 

Austin Doherty Georgine Loacker 

Margaret Earley Marcia Mentkowski 

Rita Eisterhold Marlene Neises 

Rosemary Hufker Jean Schafer 

Anne Huston Christine Trimberger 

Patricia Hutchings 



Eunice Monroe assisted in the data management.



Paper presented at the Annual Meeting of the American Educational Research Association 
Boston, April 1980. 






VALIDATING ASSESSMENT TECHNIQUES IN AN OUTCOME-CENTERED
LIBERAL ARTS CURRICULUM: 
VALUING AND COMMUNICATIONS GENERIC INSTRUMENTS 



INTRODUCTION 

Some liberal arts colleges have recently been responding to a growing 
concern for the adequacy of students' professional and career preparation, 
by specifying the outcomes or abilities critical for future effective 
performance. These colleges have also taken the next step and created 
curricula to develop these abilities in each student in such a way that 
they can be expected to transfer to work settings after college. 

Such "outcome-centered" colleges focus on assessing performance as
well as knowledge as a key to bridging the gap between college and career.
They have developed more nontraditional assessment techniques to capture
both the learning and performance of these broad abilities to enable
faculty to judge the extent to which these competences have [illegible].

The purpose of this paper is to explore the issues in the
validation of these more nontraditional assessment techniques, and to
illustrate, empirically, some ways in which such validation studies may
proceed. Validation of these techniques is particularly important since
the learning that results from the use of performance-based assessment
techniques is often an intrinsic part of the objectives and methods of
competence-based curricula (King, 1979). Further, validation of
assessment techniques can be a cornerstone in establishing the validity
of the abilities learned in college for later careering (Mentkowski &
Doherty, 1977, 1983).






The faculty of outcome-centered or competence-based programs are
concerned with validity issues. They are concerned with the quality of the
learning process, including the assessment techniques, and with the extent
to which learning outcomes measured are the result of instruction (internal
validity). They are also interested in how these outcomes measured by
assessment techniques compare with what is possible for students to
achieve, both in regard to outcomes credentialed and to the more
"intangible" outcomes of college often thought to be related to future
success. Further, colleges want to know whether the abilities learned in
college affect graduates' future performance (external validity). However,
questions of external validity often follow questions regarding a program's
internal validity. Thus, the reliability and validity of the techniques
of assessment play a crucial role in any validity studies undertaken by
outcome-centered colleges.

As much as researchers may be tempted to apply existing theoretical
validation models to these assessment techniques in toto, the methodological
constraints embedded in outcome-centered programs require nontraditional
validation strategies. The unique characteristics of these programs
affect the assumptions underlying commonly accepted methodological
approaches for establishing validity. Clearly then, in order to establish
the internal validity of assessment techniques and the constructs
underlying them in a competence-based program, one needs to develop methods
that are derived from the holistic, complex nature of outcome-centered
curricula. Cronbach (1971) notes that investigations used for construct
validation should be purposeful rather than haphazard; performance data
should be interpreted within a given theoretical framework. Thus,
nontraditional approaches to establishing the validity of assessment
techniques, and the consequent internal validity of the program, demand an
all-encompassing view of outcome-centered programs and instrument
characteristics and require rethinking and re-evaluating existing validation
methods.

The measurement techniques employed in competence-based education are
derived from the program characteristics. Gamson (1979) identified three
important similarities among competence-based programs in higher education:
(1) Educational outcomes reflect successful functioning in life roles;
(2) Instructional time is independent of the achievement of educational
outcomes; and (3) Certification of achievement of outcomes is reasonably
objective and verifiable.

Three measurement implications can be derived from these similarities
in program characteristics: (1) Measuring successful functioning in life
roles calls for performance assessment techniques rather than paper and
pencil tests; (2) There are no absolute [illegible]
of instruments or instruction at a certain point in time,
because added instructional time subsequently alters students' performance;
and (3) Assessment techniques are most often criterion-referenced, since
students are credentialed or certified according to a specified set
of criteria.

These general descriptions of program and measurement characteristics
are a beginning for rethinking and re-evaluating existing instrument
validation strategies. As we have worked to establish the validity of
techniques in one such outcome-centered program, we have come to realize
that we must also understand the specific theoretical framework underlying
this program, even though there are some similarities to other
competence-based programs. The way in which the faculty works together to
develop instruments, curricula and program improvements must also be taken
into account. There is no substitute for "trying out" various methods for
validating instruments, and then presenting the results to faculty, who ask
questions of clarification, suggest directions, and spell out "what they
want to know" about their instruments.

For us, developing a conceptual framework for the validation of
assessment techniques in an outcome-centered curriculum demanded that it be
applicable across competences, disciplines, and instruments. Our validation
model was derived from the following sources:

• The competence conceptual model defined by faculty

• The assessment system designed to measure students' performance

• The characteristics of the assessment techniques

• Ongoing validation studies submitted to the faculty for critique

• Faculty questions

The following sections provide a brief glimpse of each of the above
named sources, and result in a description of the questions that guided
the validation strategies used to validate two instruments: the "empirical
illustrations" that follow.

Competence Conceptual Model 

The theoretical and pedagogical framework of eight competences as
defined by Alverno faculty constituted our frame of reference.¹ The
Alverno curriculum centers on student competence as outcomes. Students
are required to demonstrate mastery of eight competences:

-- Effective communications ability
-- Analytical capability
-- Problem solving ability
-- Valuing in a decision-making context
-- Effective social interaction
-- Effectiveness in individual/environment relationships
-- Responsible involvement in the contemporary world
-- Aesthetic responsiveness

¹ By framework we refer to the competence conceptual model as outlined
in Liberal Learning at Alverno College, 1976.

The conceptual framework underlying the competences is defined as
Generic, Developmental and Holistic. The assessment techniques are
created to follow these concepts. Consequently, faculty design
instruments according to the following questions:

-- To what extent can the student demonstrate the same ability in a
variety of settings (Generic)?

-- To what extent does the student demonstrate a progressive learning
pattern (Developmental)?

-- To what extent does the student demonstrate integration of
competences in a single performance given that the competences
are inseparable parts of the whole person (Holistic)?

Assessment System

The Alverno assessment system requires students to demonstrate
incremental gains while progressing through six sequential levels in each
of the eight competences. Students are required to attain competence levels
in sequence and to demonstrate cumulative mastery. Multiple assessors,
multiple contexts and multiple modes of assessment add to the quality
assurance of the assessment process. The competences are integrated with
the content within various disciplines.

Characteristics of Assessment Techniques

Since the assessment system requires a wide network of assessment
techniques, faculty members individually and jointly design, evaluate and
revise instruments¹ within disciplines, departments and interdisciplinary
competence divisions. The instruments are designed to:

-- measure the learning objectives for the competence level

-- elicit the full nature of the ability

-- provide opportunities to integrate content and competence at an
appropriate level of sophistication

-- measure the integration of the competence with other relevant
competences

-- use a production task rather than a recognition task

-- use an assessment mode similar to the ability as usually expressed
rather than an artificial mode

-- allow for the judgment of performance against public and explicit
criteria, by the assessor(s) and by the student in a self-assessment

-- allow for administration external to the learning situation

-- provide diagnostic, structured feedback to the student on her
strengths and weaknesses

-- provide evidence for credentialing student's performance

Ongoing Validation Studies

The process of conducting separate, independent validation studies led
us toward development of a more general validation framework applicable
across competences, disciplines and instruments. Findings from validation
studies establish new frameworks for subsequent studies by broadening the
scope and generating a "pool" of validation methods.

¹ By instrument we mean the set of criteria employed to evaluate
student performance while reacting to a specific stimulus.

[illegible] during faculty involvement with (1) the process of refining and
revising the instruments, (2) responding to results from initial ongoing
studies, and (3) brainstorming questions in specially designed
sessions. Some of the questions generated were:

-- Are the instruments created to assess what we want to assess?

-- Do students perform successfully in the assessment process as the
result of learning experiences they complete?

-- Is a student's performance on a particular technique following
instruction a true representation of what she has learned and
can do?

-- How do the competences differ and how do you get at the difference?

-- How best can we use group data from student performance on our
instruments to test out assumptions about the complex nature of a
given competence?

-- How best do we assess whether the competence levels are truly of
sequential complexity?

-- How can we best describe the patterns of a student's performance
across time as she progresses through the competence levels?

-- How do we describe or chart increasingly complex gains in student's
performance?

These and other similar questions indicated that the faculty are
interested in the validity of the instrument criteria and the extent to
which the instruments measure the effectiveness of instruction. Faculty
are even more interested in the construct validity of each of the eight
competences. They wish to achieve greater insight into the underlying
meaning of each of the eight dimensions and the developmental, cumulative
nature of the competences. These faculty questions provided direction for

the creation of our validation framework and confirmed our earlier
objectives for establishing the internal validity of the competences and
assessment techniques (Mentkowski & Doherty, 1977):

• Establish the validity of the techniques used to assess
students' behavioral performance of the competences by adapting
or developing validation strategies appropriate for use with
nontraditional assessment techniques;

• Compare student performance across and within competences to
further refine the nature of the competences and their
interrelationships; and

• Examine the relationships between student performance and
external criterion measures.



The Empirical Illustrations

The [illegible] studies described in this paper respond to these
[illegible] validity. We first establish the validity of [illegible]
by demonstrating the effects of instruction on [illegible] an
instructed group compared [illegible]. Then we establish the
[illegible] and developmental, cumulative nature [illegible].

[illegible] an assessment technique means conducting [illegible]
extent to which the instructed [illegible] between the performance
[illegible]

3. Is variability in student performance within the instructed group
different from variability within the uninstructed group? What
are the directions of such differences within a single competence
level? Across competence levels? Can we specify a desired
variability pattern given the competence learning objectives?

Establishing the Validity of the Meaning
and Developmental Nature of the Competence

Then we establish the construct validity of the meaning and
developmental, cumulative nature of the competence. A competence is defined
as an ability that can be broken open into several components and specified
developmentally¹ at each of six sequential levels. For us, establishing
the construct validity of a competence means verifying the expert judgment
or interpretation² of student performance against the competence as defined
by faculty, and exploring the meaning of the competence in light of the
empirical data. This means we must investigate the relationship among
competence criteria before and after instruction and examine the extent
to which the competence definition can account for all the demonstrated
[illegible].



The questions that guide the attempt to establish construct validity
are derived from the generic, developmental and holistic nature of
competence definition:

1. Can we identify improved associations among separate components of
a competence within an instructed group of students as compared to
an uninstructed group?

2. Can we identify definite patterns of response in an instructed
group of students which are different from patterns of response
in an uninstructed group?

3. Do clusters of performance form the unidimensional ability
specified in the competence definition?

4. To what extent can we attribute differences between instructed
and uninstructed students to the sequential or cumulative nature
of the competence levels?

5. Does the attainment of one component of a competence facilitate
attainment of another component?

6. What are the prerequisite skills needed to acquire new abilities
in a given competence?

¹ We use the word developmental to imply sequential levels of an
ability so specified for pedagogical reasons. They are not
cognitive-developmental "stages."

² The majority of the assessment techniques employed in the college yield
inferential data generated by expert judgment. Within the Alverno learning
process, irrespective of where students are assessed, their proficiency at a
competence level is evaluated by faculty who share the same understanding of
the meaning of a given competence and use similar criteria. Even off-campus
assessors participate in training workshops and adopt Alverno's competence
definition and criteria as a basis for their judgment. In establishing
construct validity, we were actually attempting to establish the validity
of assessors' interpretation of student performance against a set of
competence criteria. As Messick (197[illegible]) notes, our task is to
validate not [illegible] interpretation of data arising from a specific
procedure,

Developing Validation Strategies

Because our evolving validation model is derived from the internal
framework of the program characteristics, we pay close attention to the
corresponding implications for measurement we outlined earlier (see p. 3):

1) measuring performance rather than paper and pencil tests,

2) measuring student's progress as a function of instructional time,
and

3) criterion referenced measurement techniques.

Alverno faculty measure performance in action rather than just on
paper and pencil tests; they measure a student's progress as a function of
instructional time, and they use criterion referenced measurement techniques.

Student performance on assessment techniques is examined by a variety
of strategies. We establish the validity of assessors' interpretation of
student performance against the competence criteria or definition, and then
establish inter-rater reliability of assessor judgments. We investigate
patterns of performance, their differences and similarities, within and
between contrasted groups (instructed and uninstructed). We also attempt
to establish the validity of the sequential, developmental and cumulative
nature of the competence levels. As for the time dimension, i.e. the
rate of competence attainment, our validation strategies focus on the
range of individual differences, direction in variability changes,
magnitude of instructional effect, reduction in student variability while
progressing upward on the mastery continuum, and establishing a baseline for
entering students. The use of criterion referenced assessment techniques
directs us toward an analysis of relationships among criteria, criteria
evaluation and identifying possible cutoff points for credentialing.

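The inter-rater reliability step mentioned above can be illustrated with a
minimal sketch. The paper does not say which agreement statistic was used,
so the choice of simple percent agreement and Cohen's kappa here is an
assumption, and the two assessors' pass/fail ratings are invented for
illustration only.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Fraction of performances on which two assessors gave the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both assessors used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical data: 1 = criterion met, 0 = criterion not met,
# for ten student performances judged by two assessors.
assessor_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
assessor_b = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]

agreement = percent_agreement(assessor_a, assessor_b)
kappa = cohens_kappa(assessor_a, assessor_b)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

Kappa is reported alongside raw agreement because two assessors who both
pass most students will agree often by chance alone.
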
In our view, a valid assessment technique in a competence-based
framework will show evidence of reduced variability in the instructed group.
Since mastery for Alverno students is not viewed as an all-or-none outcome
but as a continual process, we consider an instrument at least partially
valid if the instructed group shows a decrease in variability. An instructed
group should perform significantly better on the entire instrument, as well
as on the individual competence criteria. If the magnitude of the
instructional intervention accounts for at least 25% of the variation in the
instructed group, we accept this as evidence that improved performance is
not due only to individual differences, but also to the effectiveness of the
learning experiences.
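
The criteria in the preceding paragraph can be sketched in code. This is
an illustration under stated assumptions, not the authors' procedure: the
"25% of the variation" criterion is read here as eta squared (between-group
sum of squares over total sum of squares), the significance test is
omitted, and the scores are invented.

```python
from statistics import mean, variance


def eta_squared(instructed, uninstructed):
    """Share of total score variation associated with group membership."""
    all_scores = list(instructed) + list(uninstructed)
    grand_mean = mean(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_between = sum(
        len(group) * (mean(group) - grand_mean) ** 2
        for group in (instructed, uninstructed)
    )
    return ss_between / ss_total if ss_total else 0.0


def summarize(instructed, uninstructed, threshold=0.25):
    """Check the variability and effect-size criteria for one instrument."""
    effect = eta_squared(instructed, uninstructed)
    return {
        "variance_instructed": variance(instructed),
        "variance_uninstructed": variance(uninstructed),
        "reduced_variability": variance(instructed) < variance(uninstructed),
        "higher_mean": mean(instructed) > mean(uninstructed),
        "eta_squared": effect,
        "meets_threshold": effect >= threshold,
    }


# Hypothetical total scores on one instrument, invented for illustration.
instructed_scores = [5, 6, 6, 7, 6]
uninstructed_scores = [2, 5, 3, 6, 4]

report = summarize(instructed_scores, uninstructed_scores)
for name, value in report.items():
    print(name, value)
```

With samples as small as those in the two studies, such figures are better
read as descriptive patterns than as firm estimates.
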

We expect that instruction will improve associations among components
of a given competence whereas an uninstructed group will show weak
associations. Improved associations should form clusters of abilities which
will conform to the definition of the competence. We also expect that the
instrument is measuring the unidimensional abilities of a single competence.
Is the instrument valid if it measures other abilities as well? We arrived
at the term "improved associations" while selecting a construct validation
technique that identifies relationships among variables, i.e. the
behavioral manifestation of the ability under study (Payne, 1975, p. 113).
In a comparative analysis (instructed vs. uninstructed group) patterns
of relationships can then be identified separately within the two groups,
and then compared.
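
The comparison of associations described above can be sketched as follows.
The paper does not name the association measure, so the use of Pearson
correlations averaged over all pairs of components is an assumption; the
component names and scores are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no measurable association
    return sxy / sqrt(sxx * syy)


def mean_association(scores_by_component):
    """Average correlation over every pair of components within one group."""
    pairs = list(combinations(sorted(scores_by_component), 2))
    total = sum(
        pearson(scores_by_component[a], scores_by_component[b])
        for a, b in pairs
    )
    return total / len(pairs)


# Hypothetical scores for five students per group on three components.
instructed = {
    "observe": [3, 4, 5, 6, 7],
    "infer": [2, 4, 5, 6, 8],
    "relate": [3, 3, 5, 6, 6],
}
uninstructed = {
    "observe": [3, 4, 5, 6, 7],
    "infer": [5, 2, 6, 3, 4],
    "relate": [4, 6, 2, 5, 3],
}

instructed_assoc = mean_association(instructed)
uninstructed_assoc = mean_association(uninstructed)
print(f"instructed={instructed_assoc:.2f}, uninstructed={uninstructed_assoc:.2f}")
```

Each group's associations are computed separately and only then compared,
which mirrors the within-group pattern analysis the text describes.
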

Let us suppose that the components of analytic ability are to observe,
to make logical inferences and to draw relationships. Faculty who wish to
educate toward those components or skills design a curriculum which will
enhance the development of analytic abilities. Learning objectives can
then be verified against actual student performance. Students who complete
the learning sequence are expected to demonstrate "improved associations"
among variables which measure observation skills and the ability to make
logical inferences and to draw relationships, whereas students who just
entered the program will demonstrate random associations among the skills
under study. Furthermore, the instructed group will show a clustering of
the three components, forming the analysis competence.

Since the competences are also defined developmentally in a pedagogical 
competence model, we expect that the uninstructed group will not demonstrate 
a coherent sequence and will deviate from the one specified. If a 
competence is developmental, instructed students will demonstrate the 
prescribed sequence of the competence levels and cumulative mastery. 
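
One way to examine the sequential, cumulative expectation above is to
check whether each student's record of levels attained has no gaps. The
sketch below is a scalogram-style check chosen for illustration; the paper
does not state that this is the procedure used, and the records are
hypothetical.

```python
def is_cumulative(levels_attained):
    """True if no level is attained after a lower level was missed."""
    missed_lower_level = False
    for attained in levels_attained:
        if attained and missed_lower_level:
            return False
        if not attained:
            missed_lower_level = True
    return True


def proportion_cumulative(group):
    """Share of students whose records follow the prescribed sequence."""
    return sum(is_cumulative(record) for record in group) / len(group)


# Hypothetical records: one list per student, one entry per level (1-6),
# 1 = criteria for that level met, 0 = not met.
instructed_records = [
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 0, 0, 0],
]
uninstructed_records = [
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
]

instructed_share = proportion_cumulative(instructed_records)
uninstructed_share = proportion_cumulative(uninstructed_records)
print(instructed_share, uninstructed_share)
```

A markedly higher share of gap-free records in the instructed group would
be consistent with, though not proof of, a developmental competence.
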

By employing a multi-analysis approach within the contrasted instructed 
and uninstructed groups, we departed from existing construct validity 
methods such as factor analysis, convergent and discriminant validity 
methods. We preferred a pattern analysis approach within two contrasted 
groups and explored the differences and similarities in patterns. 

The strategies employed for validating assessment techniques, and
the competence definition and its developmental nature, were characterized
by a reciprocal process in which preliminary results from validation
studies were communicated to Competence Divisions (analogous to Discipline
Divisions in their function), who in turn expand or generate the questions
that further direct the analysis. Such a field approach provides prompt
feedback to faculty for instrument revision and ensures the researchers'
sensitivity to the faculty's internal frame of reference.

We wish to emphasize that the purpose of the present study is to create 
a validation model rather than to report results. Since the two empirical 
illustrations involve small numbers of students, we have chosen to emphasize
the method by which we analyzed and interpreted the data rather than the 
actual outcome or results. If our methods prove effective in validating 
behavioral data within a competence-based program, we would suggest further 
tests of our validation strategies with larger samples. 

The Instruments 

Since the following empirical studies are part of our continuing efforts
to develop a validation model applicable across instruments irrespective of
the competence or discipline they intend to measure, we selected generic
instruments at level 4 for our empirical illustrations. Generic instruments
are designed to ensure cross-college credentialing on the same instruments
instead of using a variety of instruments for the same purpose.¹

Briefly, a generic instrument is one which assesses a competence level
across content areas instead of using a large variety of instruments, each
of which must be validated. In a generic instrument, general criteria can
be integrated with criteria specific to the way the ability appears in the
discipline or content area in which the instrument is used.

¹ After approximately two years in college, students contract for
credentialing at level 4 of a given competence.

Several Competence Divisions have created generic assessment 
techniques to assess for their respective competences at level 4. This 
represents a step toward increased consistency in assessment at the level 
where general education requirements are certified. It also allows 
greater comparability across disciplines as we evaluate our assessment 
techniques, and provides a more uniform data base for comparison with 
external criterion measures and for longitudinal studies. Validation
studies on two of these instruments, the Valuing generic instrument and
the Communications generic instrument, were conducted for the purposes of
this study.

The Valuing in Decision Making competence focuses on developing the 
student's ability to use a valuing process with a number of components 
including discerning value and moral issues and resolving value and moral 
conflicts.¹ The instrument is designed to elicit value and moral judgments
and decision making through written, oral and group decision-making modes. 

The Communications competence focuses on the process of effective 
clarification and involvement between a presenter and an audience. Students 
performing on the generic Communications instrument after two years in 
college demonstrate abilities in four components or modes of the Communica- 
tions competence: Speaking, Writing, Listening and Reading. Their perfor- 
mance provides data across different modes of the same competence. 

The Valuing Division was seeking to validate the pilot administration
of the Valuing generic instrument. At that juncture they were concerned
with a variety of validity issues: How well does the instrument measure
the effects of instruction? How well does it measure competence levels?
Does it discriminate between a group of instructed and uninstructed
students? Preliminary results from an initial study (Valuing Generic
Instrument: Study A) were reported to the Valuing Division who then
generated further questions which focused on evaluation of the instrument
criteria, and the developmental, sequential and cumulative nature of the
Valuing competence. The researchers then proceeded to respond to these
newly defined faculty interests through a broader scope of validation
strategies. The subsequent analysis explored variability of students'
performance, distribution of scores within the separate competence levels,
and the sequential, cumulative nature of the competence by way of
correlation matrices.

¹ For a description of how Valuing is taught and assessed at Alverno,
refer to: M. Earley, M. Mentkowski and J. Schafer, Valuing at Alverno:
The Valuing Process in Liberal Education (Milwaukee: Alverno Productions).

The Communications study benefited from the insights learned from
the Valuing study. We also had a better understanding of faculty
concerns, probably because communications has always been explicitly
taught in college. Although the Communications study began with a more
narrow perspective appropriate for purposes of instrument revision and
criteria evaluation, the scope and the range of issues again broadened,
directed at the attempts to validate the competence model. It is
this part of the study, we believe, which contributes the most to
validation methodology in competence-based programs in higher education. The
multi-analysis approach employed within two contrasted groups provided
insight into differences as well as similarities in patterns of student
performance on the Communications competence. The analysis techniques
selected are more commonly used in other areas of the social sciences.
Yet they yielded a greater understanding of the relationship between
competence criteria and students' actual performance.






VALUING GENERIC INSTRUMENT: STUDY A 
Method 

Subjects

The generic instrument assessing the Valuing competence, level 4, was
administered in January, 1978 to a group of new students entering Weekend
College¹ (WEC) who had no previous instruction at Alverno College
(uninstructed group). These 20 students were randomly selected from all
new students entering WEC Semester II (n = 60). During Spring, 1978, the
same generic instrument was administered to 11 Weekday College students
contracted for an assessment of Valuing, level 4, as part of their learning
sequence (instructed group). Level 4 is usually achieved at the end of the
general education sequence after two years in college.

Design 

We selected students from Weekend College for this comparison in order
to control for the effects of maturation. Since the Valuing competence
can be expected to have a cognitive-developmental component, we felt it
wise to select a group of students who could be expected to have developed
the Valuing ability to some extent even though they had not had college
instruction. (Median age for the WEC is 33; WDC median age is 22 years.)
Further, a true pre- and post-instruction comparison is not generally
feasible because constant changes in the curriculum and instruments
preclude giving the same instrument to the same students before and after
instruction (two years would separate the two administrations of the
instrument). We were also interested in comparing two groups that are most
dissimilar given the type of construct validity we were using.

¹ In Fall, 1977, Alverno College instituted a Weekend College. The
Weekend College is an opportunity to earn a four year college degree by
going to college every other weekend from late August through May. It
was planned for women of all ages who wish to earn a college degree,
but are unable to attend weekday and evening classes. Classes involve
intensive study, a close working relationship with instructors and
fellow students, and maximum opportunity for self-directed study. A
semester of Weekend College is equivalent to a semester of Weekday
College. The scheduling of courses within a limited time frame and
the resulting intensification and concentration of study distinguish
Weekend from Weekday College. Bachelor programs are available in the
major areas of communications, management, and nursing. All students
take courses in liberal arts, which are designed to complement the major
and provide a breadth of knowledge. Because of the intensive nature of
Weekend College, it is necessary for students to function as self-directed
learners. An introductory course designed to provide students with the
independent learning skills they need is, therefore, a required part of
the curriculum. Currently, approximately 500 women are enrolled in the
Weekend College. Median age is 33 years. About 90% of these women
currently hold full-time jobs in the Milwaukee area.

Procedure 

Students who have had instruction "participate" in the validation study
as part of the assessment process. How did we motivate new students in
the uninstructed group on the second day of their college career to take
these instruments? We were concerned that asking them to "take tests"
would increase anxiety and influence the results. We tried to resolve
this problem by presenting a rationale to the uninstructed group.
The Office of Evaluation administered the Learning Style Inventory¹ to
the group on a Friday night and provided feedback on Sunday just before
they were involved in the validation study. In a talk to the students,
the Director of Evaluation labeled the instruments "practice assessments"
and called attention to the positive outcomes of participation. She
suggested that taking a practice assessment would assist them by answering
the following questions usually raised by new students:

• What can I learn about myself that will assist me in becoming a
better learner (e.g., feedback on the Learning Style Inventory)?

• What is meant by a "competence"?

• What is meant by "demonstrating a competence"?

• What are assessments like here at Alverno?

• What are my initial capabilities?

• Would I be as competent if I hadn't come to college?

¹ David A. Kolb, Learning Style Inventory: Self-Scoring Test and
Interpretation Booklet (Boston: McBer & Company, 1976).

Since the uninstructed group already had feedback on the Learning
Style Inventory, we did not feel we had to provide feedback on their
performance, a usual college procedure on any assessment. Giving students
individual feedback on assessments used for research purposes would
overburden the assessment system. We observed that the uninstructed group
did seem to take their performance on the instruments seriously.

Instrument 

The following paragraphs describe the Valuing generic instrument:

"In 1977, the Valuing Division followed the suggestion of
the Assessment Committee (charged with the college-wide
evaluation, revision, and validation of assessment instruments)
to develop a "generic" instrument that would examine student
performance across curriculum levels of Valuing. The result
was an additional instrument, a "generic instrument," to assess
levels 1 through 4 which all students would demonstrate at the
end of their general education sequence. Because its content
and setting were external to any of the student's course
experiences, this instrument was expected to provide a summative
assessment of her development in valuing to this point. This
generic instrument reiterates every one of the criteria by which
the student's several instructors have assessed her developing
ability up to level 4. Yet it applies them as part of a tool
which is in no way dependent upon the specific assessments or
courses she has taken.

Space does not permit more than a brief description of this
generic instrument. It consists of four parts that ask the
student (1) to infer values from a literary work; (2) to analyze
the relationship of values to scientific and technological
developments; (3) to participate in a moral dilemma group
discussion; and (4) to analyze her own decision-making process.






Various sets of stimuli can be developed for the instrument,
reflecting a range of issues. One such set involves students in
the issue of genetic engineering, using a short story, newspaper
article and an article from a scientific journal, a moral dilemma,
and directions for her response to each. She is first asked to
compare the values she infers from the short story to her own
value system, and then to that of American society. She then
writes an editorial for either the local newspaper or a scientific
journal on 'How our decisions regarding scientific developments
influence our value systems, cause value conflict, and raise
questions regarding the relationship between private decision-
making and public policy.' She next participates in the facilitator-
led small group discussion of a moral dilemma, and then analyzes
her own decision-making process throughout the experience and
writes a letter to a congressman on genetic screening 'stating
her case, describing her action plan and relating how her own
values motivated her decision.'

The student's performance is measured according to 67
criteria in all:

29 of these repeat the faculty's criteria from levels 1
through 4 on which she has already been wholly or partly
credentialed;

21 were developed by the Valuing Division for the student's
self-assessment on the moral discussion; and

17 were developed by the Division for instructor assessment
of the student's participation in the moral discussion and
for tallying the occurrence of her use of the various modes
of judgment, her identifying of moral issues and moral
orientations categorized in Kohlberg et al., Standard Form
Scoring Manual (1978).

The student is credentialed on the 29 'level' criteria, and
so these criteria were submitted for validation. The 17 criteria
for judging her performance in the discussion also help form a
basis for credentialing judgments on some of those 29 criteria,
such as 'Recognizes necessity for and utilizes information and
knowledge in moral reasoning, judging and deciding' or 'Articulates
the point of view of another person or position with empathy and
reason.'¹

What has been created in this 'generic' instrument is an
opportunity to elicit and examine the moral reasoning of college
students in several situations, to view and analyze their
participation in a moral dilemma discussion and to judge the
discussion's effectiveness." (Mentkowski, 1980, pp. 42-44).

¹ Thus there is some basis for generalizing from the results on the 29
criteria to the 17 criteria. The 21 criteria by which the student assesses
herself were not included in this validation study.

No cutoff points for credentialing had been specified for this instrument
since one purpose of this study was to assist faculty to identify the range
of student performance and to provide data for instrument revision. Some of
the 29 criteria assess only one level, some all four levels, and some more
than one level.

One member of the Valuing Division evaluated both groups to ensure
consistency in the scoring procedures. While reviewing a student's responses
to the stimuli the scorer was looking for evidence in the student's work
which would meet each criterion. When such evidence was identified, the
criterion was checked. The more criteria checked, the higher the student's
score. Thus, by score, we actually mean the number of checked criteria.

Students' checkmarks on each of the criteria were tabulated separately
for the instructed and uninstructed groups, providing frequency of response
per criterion within each group, each student's total score, and the total
number of level 4 checkmarks per student.

Results

The first analysis asked:

• How does the performance of the instructed group compare to the
performance of the uninstructed group on each criterion?

Table 1 shows the frequency of student responses (checked criteria) to
each of the 29 criteria. Frequency of response is reported in percentages.
If the uninstructed group performed better than or equal to the instructed
group, the criterion is labeled "non-discriminative." If the instructed group
performed better than the uninstructed group, the criterion is labeled
"discriminative" and starred (*).

At this preliminary stage, Table 1 was created to provide a descriptive
analysis only, rather than a statistical analysis.¹ The extent to which the

¹ In the following Communications study, a statistical analysis was
employed to identify discriminative items.
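
The descriptive labeling rule can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the checkmark data, the criterion numbers and the function names are hypothetical and are not taken from the study.

```python
def response_frequency(checks):
    """Percent of students whose work met the criterion (1 = checked, 0 = not)."""
    return 100.0 * sum(checks) / len(checks)


def label_criteria(instructed, uninstructed):
    """Label each criterion by comparing the two groups' response frequencies.

    Both arguments map a criterion number to a list of 0/1 checkmarks,
    one entry per student in that group.
    """
    labels = {}
    for criterion in sorted(instructed):
        f_inst = response_frequency(instructed[criterion])
        f_unin = response_frequency(uninstructed[criterion])
        # Starred ("discriminative") only when the instructed group does better;
        # a tie or a better uninstructed result is "non-discriminative".
        if f_inst > f_unin:
            labels[criterion] = "discriminative"
        else:
            labels[criterion] = "non-discriminative"
    return labels


# Hypothetical checkmarks for two criteria (not the study's data).
instructed = {1: [1, 1, 1, 1], 5: [1, 1, 1, 0]}
uninstructed = {1: [1, 1, 1, 1], 5: [1, 0, 0, 0]}
labels = label_criteria(instructed, uninstructed)
```

Because the rule compares raw percentages only, it says nothing about whether a difference is statistically reliable, which is consistent with the descriptive purpose stated for Table 1.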






Table 1

Criterion Response Frequency (Percent) for
Instructed and Uninstructed Students

[Table body: percentage of uninstructed and instructed students meeting
each of the 29 criteria, with the criteria grouped by the level or levels
assessed (Level 1; Levels 2 & 3; Level 4; Levels 3 & 4; and a further
combined-levels group). The individual cell values are not recoverable
from this copy.]

NOTE: Discriminative criteria are starred (*).



... the two groups, as described in Study A. ... revision of the instrument:

• ... of the discipline?

• Should the number of instances of supportive ... student's work affect
our decision to credential ...?

• ... to different criteria? How ... impact our credentialing of students?

Faculty also generated questions that went beyond concerns for the
content validity of the instrument, and our study responded to them:

• Are we concerned with 100% mastery at all levels? At level 4?
Or rather, are we concerned with reducing variability of student
performance (mastery students should fall into a narrow
performance range)?

• Are the competence levels sequential? If a student attained
level 4 and not levels 2 and 3, what does that tell us? Are
level 2 and 3 criteria clear enough for instructional purposes?
How are levels 2 and 3 linked to the developmental sequence?

The first analysis asked:

• To what extent does reduction in variability of student
performance imply effectiveness of instruction?

It was decided to further investigate variation in total score
performance of the instructed group compared to the uninstructed group. The
uninstructed group responses formed a normal distribution whereas the
instructed group formed a negatively skewed distribution where most of
the students fell at the mean range or above (Figure 1). The omega
squared statistic¹ (Hays, 1973), which estimates the amount of statistical
association implied by the obtained difference between means (ω² = .22),
suggests that 22% of the instructed group total score variation (levels 1
through 4) was due to instruction.
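
A minimal sketch of how such an estimate can be computed follows. It assumes the usual estimate of omega squared for two independent groups, (t² - 1) / (t² + N₁ + N₂ - 1), with t taken from a pooled-variance t test. The function names and the scores are hypothetical and are not the study's data.

```python
import math
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Independent-samples t statistic using the pooled variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


def omega_squared(t, n_a, n_b):
    """Estimated omega squared for a difference between two group means."""
    return (t ** 2 - 1) / (t ** 2 + n_a + n_b - 1)


# Hypothetical total scores (not the study's data).
instructed_scores = [14, 17, 18, 16, 20, 19]
uninstructed_scores = [9, 12, 11, 15, 13, 10, 14]
t_value = pooled_t(instructed_scores, uninstructed_scores)
effect = omega_squared(t_value, len(instructed_scores), len(uninstructed_scores))
```

The estimate can fall below zero when t² is less than 1, in which case it is usually read as no detectable association.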

Figure 1. Frequency polygons of Valuing total scores for
instructed and uninstructed students. (Horizontal axis: Valuing Total
Score, in intervals 0-4, 4-8, 8-12, 12-16, 16-20 and 20-24; the point of
intersection of the two polygons is marked.)

Note: Point of intersection can be considered as an optimal cutoff
point for an acceptable level of performance (Berk, 1976).
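
One way to locate such an intersection numerically is sketched below. This is an assumption-laden illustration rather than Berk's procedure: it interpolates linearly between interval midpoints, and the function name and frequencies are hypothetical.

```python
def intersection_cutoff(midpoints, freq_lower_group, freq_higher_group):
    """Score at which the higher-scoring group's frequency polygon first
    rises above the lower-scoring group's polygon.

    Frequencies are given per score interval, in the same order as the
    interval midpoints. Returns None if the polygons never cross that way.
    """
    for i in range(len(midpoints) - 1):
        d0 = freq_higher_group[i] - freq_lower_group[i]
        d1 = freq_higher_group[i + 1] - freq_lower_group[i + 1]
        if d0 < 0 <= d1:
            # Linear interpolation to the point where the difference is zero.
            fraction = -d0 / (d1 - d0)
            return midpoints[i] + fraction * (midpoints[i + 1] - midpoints[i])
    return None


# Hypothetical percentages per interval (not the study's data).
midpoints = [2, 6, 10, 14, 18, 22]
uninstructed_pct = [5, 25, 40, 20, 10, 0]
instructed_pct = [0, 0, 9, 27, 46, 18]
cutoff = intersection_cutoff(midpoints, uninstructed_pct, instructed_pct)
```

With small groups the polygons can cross more than once; this sketch reports only the first crossing, so any cutoff it yields should be checked against the plotted distributions.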

¹ $\hat{\omega}^2 = \dfrac{t^2 - 1}{t^2 + N_1 + N_2 - 1}$

The level 4 score distribution was then examined. Figure 2 shows
that the uninstructed group displayed a positively skewed distribution
whereas the instructed group formed a normal distribution. The small
amount of overlap between the distributions of the two groups (Figure 2)
shows that instructional intervention is effective in differentiating
between the two groups, and is more effective at level 4 than in all four
levels combined.



Figure 2. Frequency polygons of Valuing Level 4 scores for
instructed and uninstructed students. (Horizontal axis: Valuing Score;
the point of intersection of the two polygons is marked.)

Note: Point of intersection can be considered as an optimal cutoff
point for an acceptable level of performance (Berk, 1976).



Omega squared statistic computed on level 4 scores (ω² = .37)
estimated that 37% of the variation in the instructed group may be accounted
for by the instructional treatment. Variability within the instructed group
at level 4 (SD = 1.6) was smaller compared to the uninstructed group
(SD = 1.9). When the mean scores from levels 1 through 3 were examined,
no significant differences were obtained between the two groups.

The second analysis asked:

• Are the competence levels sequential, developmental
and cumulative?






One correlation matrix was generated from the responses of the
instructed group and another from the uninstructed group to assist in
investigating the following questions:

• To what extent are the 4 levels of the Valuing competence, assessed
by the generic instrument, sequential? Our assumption is that
criteria at one competence level will intercorrelate more highly
with each other than with criteria at different competence levels.

• To what extent do the criteria reflect a cumulative, developmental
sequence? A particular competence level should correlate more
highly with preceding levels than with the next level in the
sequence. For example, there should be a higher correlation between
level 3 and levels 1 and 2 and a lower correlation between level 3
and level 4. We assume that a student who has reached level 3 has
also mastered the first two levels.
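
The first assumption (criteria at one level intercorrelate more highly with each other than with criteria at other levels) can be checked with a short sketch like the one below. It is a hypothetical illustration: the checkmark data, the level assignments and the function names are assumptions, and averaging correlations is only one of several ways to summarize a matrix.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns None if either variable has no variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / (sxx * syy) ** 0.5


def mean_correlations(checks, levels):
    """Average criterion intercorrelation within and between competence levels.

    checks: criterion -> list of 0/1 checkmarks (same student order throughout)
    levels: criterion -> competence level the criterion assesses
    """
    within, between = [], []
    for a, b in combinations(sorted(checks), 2):
        r = pearson(checks[a], checks[b])
        if r is None:
            continue  # a criterion met by everyone or by no one is skipped
        (within if levels[a] == levels[b] else between).append(r)
    return (mean(within) if within else None,
            mean(between) if between else None)


# Hypothetical checkmarks for six students on four criteria.
checks = {
    1: [1, 1, 0, 0, 1, 0],
    2: [1, 1, 0, 0, 1, 0],
    3: [0, 1, 0, 1, 1, 0],
    4: [0, 1, 0, 1, 1, 0],
}
levels = {1: 1, 2: 1, 3: 2, 4: 2}
within_r, between_r = mean_correlations(checks, levels)
```

A within-level average clearly above the between-level average would be consistent with the assumption; with very few students the individual correlations are unstable, so the comparison is suggestive at best.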

In the instructed group, the correlation matrix did not show clusters
of higher intercorrelations among the criteria at each level.¹ This matrix
did not support the sequentiality of levels 1 to 4. However, there is some
evidence for the cumulative nature of the levels, because criteria at the
higher levels tend to form clusters of intercorrelations with lower level
criteria but not with upper level criteria. For example, students who
responded to level 4 criteria tended to respond to levels 1, 2 and 3
as well.



The correlation matrix from the uninstructed group did not show
consistent patterns of correlations that would support the sequential or 
cumulative nature of the competence. 



Discussion



Presenting the preliminary criteria evaluation and the results
supporting the overall validity of the instrument in measuring instruction
to the Valuing Division in a simple descriptive manner proved to be
effective in stimulating a group of interdisciplinary liberal arts faculty
to become involved with instrument validation. They generated questions
which directed additional analyses. The descriptive analysis of the
criteria's internal consistency was effective in identifying criteria with
potential "problems." When faculty started to explore alternative reasons
for why those criteria were non-discriminative, the broad nature of the
competence itself was discussed, and faculty raised issues related to
construct validity.

¹ Correlation matrices are available from the authors.

The graphic presentation of the frequency distribution of scores
dramatically demonstrated that the instrument was effective in discriminating
between the instructed and uninstructed groups (presumably measuring the
effects of instruction), providing a powerful motivator for faculty to
continue instrument validation.

The analysis indicated that Weekend students (median age, 33) enter
college with similar Valuing abilities compared to Weekday students
(median age, 22) at levels 1 through 3 (no significant differences were
found when levels 1 to 3 scores were compared between the two groups).
In contrast, the instructed group made a successful leap to level 4.
Instruction appears to be effective in reducing instructed students'
variation at level 4 as compared to the uninstructed group and brings
instructed students closer together on the mastery continuum. The magnitude
of the instructional effect accounted for 37% of the variance at level 4
within the instructed group.

The correlation matrices did not support the sequence of the competence
levels. But a correlation matrix based on 11 students is hardly a basis
for generating conclusions. The cumulative nature of the competence levels
was more apparent in the performance of the instructed group than the
uninstructed group, and so receives some support.






The very fact that older, more experienced women (WEC) performed
lower on level 4 criteria illustrates that instruction was effective
for the younger Weekday students and that their performance was not due
entirely to maturation. The lack of significant differences between the
instructed and uninstructed groups at levels 1, 2 and 3 provided a stimulus
to the Valuing Division members who then rewrote and clarified criteria at
levels 1 through 4. Members are currently meeting with faculty who teach
Valuing, to introduce them to the revised criteria. Further, Division
members concluded that older, more experienced women were not as likely to
need in-depth instruction on the awareness of their values (level 1), but
that there is a "leap" in the Valuing ability that is enabled by college
instruction.



Valuing Division members currently question the extent to which levels 2
and 3 are actually sequential in nature as they are currently defined.

The small sample size did not allow us to form definite conclusions
in response to the questions raised. The study did allow us to test
out a process for validating instruments incorporating faculty questions
and feedback, and to try out various methods. The study outcomes
stimulated further testing of validation strategies which helped us to
build a conceptual framework for validating assessment techniques
applicable across the various competences and disciplines. A similar
study on a larger group of students will provide a basis for possible
cutoff points for accepted levels of performance and also yield more
information about entry or baseline performance.





COMMUNICATIONS GENERIC INSTRUMENT 



Method

Subjects

The generic instrument assessing the Communications competence, level 4,
was administered in January, 1978 to a group of 20 new students entering
Weekend College (WEC)¹ who had no previous instruction at Alverno College
(uninstructed group). During Spring, 1978, the same generic instrument was
administered to 20 Weekday College students (WDC)¹ contracted for an
assessment of Communications, level 4, as part of their learning sequence
(instructed group). Level 4 is usually achieved at the end of the general
education sequence after two years in college.

A group of uninstructed students was selected from all WEC entering
students (n = 60) who had some previous course work in a content area
because the Communications instrument at level 4 demands that performance
of the competence be integrated with content. The most frequent common
content base was a previous course or two in psychology. Twenty students
were then randomly selected from the group who had some psychology to take
the Communications instruments (uninstructed group). The instruments were
administered to those in the instructed group with a comparable psychology
background.



¹ Weekend College students are generally older, experienced women who
hold a full-time job during the weekdays and come to the college full time
in a weekend time frame that allows achievement of a degree in four years.
The Weekday College students are generally younger full-time students, most
of whom enter college after graduating from high school.



Design

We selected students from Weekend College for this comparison, 
since the Communications competence can be expected to develop to some
extent due to life experience. We felt it wise to select a group of students
who could be expected to have developed the Communications ability without
formal instruction after high school. 

Further, a true pre- and post-instruction comparison is not generally
feasible because constant changes in the curriculum and instruments preclude
giving the same instrument before (beginning of the first year) and after
(end of the second year) instruction. For the type of construct validity
we were employing, we were interested in comparing two groups that are
dissimilar.

Instrument 



The generic instrument designed by the Communications Division to 
assess effective Communications as an outcome of general education
integrates several modes of communication. It assesses Writing, Speaking, 
use of Media and analytic Reading and Listening from college-entry level 
to the summative performance level which represents the completion of 
general education. 

This instrument involves content (though not necessarily a specific
course) because it assesses the communication of concepts related to an
academic discipline or a comparable area of study. In the form of the
instrument that is administered to instructed and uninstructed groups in
this study, the content is psychology. However, the format and criteria
are sufficiently generic to permit substitution of different content with
relative ease. In effect, the instrument provides a criterion measure
that is external to courses. The Communications Division sets criteria
and provides assessors. The criteria in the generic instrument are the same
as those by which the student is assessed for Communications in all of her
courses.

Specifically the instrument consists of four parts: (1) Directions to
prepare and eventually give a speech (including use of a visual);
(2) Directions to write a letter; (3) An article to read; and (4) A taped
lecture to listen to. A letter provides the initial stimulus and establishes
the setting and context. In addition to answering a series of open-ended
questions to analyze an article and a lecture, the student is required to
take new information from these two sources and integrate it into her
present understanding of the concept involved. (In the form of the
instrument used for this study, the concept is "the influence of an infant's
environment on human development.") The student is also required to assess
herself in each of the Communications modes involved.

The student's performance is measured by a total of 64 criteria: 

19 assess Writing performance 

C 

27 assess Speaking, including Media performance 

9 assess Reading performance / 

9 assess Listening performance 
The Communications battery is a generic assessment technique used to
credential students at level 4 of the Communications competence. Since
Alverno faculty view the Communications competence as a developmental,
pedagogical sequence, competence levels 1 through 4 are cumulative and
sequential. Students who wish to be credentialed at level 4 must again
demonstrate satisfactory performance on the three preceding levels for
which they have already been credentialed. Thus each of the four exercises
is divided into four hierarchical levels. The generic assessment technique
is criterion referenced since each competence level states explicitly
the behavioral criteria for satisfactory performance. The student does
not respond directly to the instrument criteria, but reacts to stimuli
designed for each of the four communication modes, and her performance
is judged by faculty who evaluate her demonstrated behavior against the
defined criteria. Once the assessor finds evidence in the student
behavior which meets the criterion, the criterion is checked. The student
has mastered the competence level if all criteria specified for that level
are checked. She will be credentialed for level 4 if the preceding three
levels and all the criteria at level 4 are checked. Each exercise
provides a score for each of the four competence levels, a total score for
the exercise mode, and a combined total score for performance on the
Communications competence.
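
The scoring and credentialing rule described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
data layout (a dictionary mapping each level to a list of checked/unchecked
marks), the function names, and the sample Listening record are hypothetical
and are not taken from the Alverno instrument itself.

```python
from typing import Dict, List

# Hypothetical layout: checks[level] holds the True/False marks an assessor
# recorded for the criteria at that level of one exercise (mode).
Checks = Dict[int, List[bool]]


def level_score(checks: Checks, level: int) -> int:
    """Number of criteria checked at one competence level."""
    return sum(checks[level])


def level_mastered(checks: Checks, level: int) -> bool:
    """A level is mastered only if every criterion at that level is checked."""
    return all(checks[level])


def credentialed_at_level_4(checks: Checks) -> bool:
    """Level 4 requires levels 1-3 to be demonstrated again, plus level 4."""
    return all(level_mastered(checks, lvl) for lvl in (1, 2, 3, 4))


def mode_total(checks: Checks) -> int:
    """Total score for one exercise mode across all of its levels."""
    return sum(level_score(checks, lvl) for lvl in checks)


def combined_total(modes: Dict[str, Checks]) -> int:
    """Combined total score across all exercise modes."""
    return sum(mode_total(checks) for checks in modes.values())


# Made-up record for one student on a nine-criterion exercise.
listening = {
    1: [True],
    2: [True, True, True],
    3: [True, False],
    4: [True, True, True],
}
```

In this made-up record the student scores 8 of 9 but is not credentialed at
level 4, because one level 3 criterion is unchecked; the total score and the
mastery decision are separate outcomes.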



Results 



Based on our experience analyzing the Valuing instrument, we used
the same set of faculty-generated questions to begin analysis of the
Communications data. The first question guiding the analysis was:

• How does the performance of the instructed group compare
to the performance of the uninstructed group on each
competence level within each Communications mode (Speaking,
Writing, Listening, and Reading)?

The means and standard deviations per level within each Communications
mode are presented in Table 1. Univariate ANOVA was employed to investi-
gate significant differences between group performances. The univariate
analysis shows significantly higher performance by the instructed group
in Speaking, levels 2, 3, and 4; Writing, levels 2 and 3; Listening,
levels 2 and 4; and Reading, levels 2 and 3.
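
For two groups, the univariate F ratio can be recovered approximately from
the published means and standard deviations. The sketch below is a generic
one-way ANOVA computed from summary statistics; the function name is
hypothetical, and the inputs are the Speaking level 2 values as they appear
in Table 1 (uninstructed M = 11.2, SD = 2.2; instructed M = 13.8, SD = 1.8;
n = 20 per group).

```python
def f_ratio_two_groups(m1, sd1, n1, m2, sd2, n2):
    """One-way ANOVA F ratio for two groups, from means, SDs, and sizes.

    Assumes the SDs are sample standard deviations (n - 1 denominator).
    Returns (F, df_between, df_within).
    """
    grand_mean = (n1 * m1 + n2 * m2) / (n1 + n2)
    ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    df_between = 1               # two groups
    df_within = n1 + n2 - 2
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Speaking, level 2 (values as printed in Table 1).
f, df_b, df_w = f_ratio_two_groups(11.2, 2.2, 20, 13.8, 1.8, 20)
print(f"F({df_b}, {df_w}) = {f:.2f}")
```

With these rounded inputs the sketch gives an F of about 16.7 on (1, 38)
degrees of freedom, close to the tabled value of 17.56; the gap is plausibly
due to rounding of the published means and standard deviations.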



Table 1

Means (M), Standard Deviations (SD), and F Ratios
for Instructed and Uninstructed Groups per Level
within Modes of Communications

Communications Mode/        Uninstructed    Instructed     F ratio
Competence Level              (n = 20)       (n = 20)

Speaking
  Level 1     M  =               .8         [illegible]      --
              SD =               .4             .3
  Level 2     M  =             11.2           13.8         17.56**
              SD =              2.2            1.8
  Level 3     M  =          [illegible]        3.3         12.33**
              SD =          [illegible]     [illegible]
  Level 4     M  =              1.1            2.5      [illegible]**
              SD =               .9            1.3

Writing
  Level 1     M  =               .8             .7       [illegible]
              SD =               .4             .4
  Level 2     M  =              7.8         [illegible]  [illegible]*
              SD =              3.3            2.5
  Level 3     M  =          [illegible]        3.9         12.47**
              SD =              1.3            1.4
  Level 4     M  =               .9            1.2          1.80
              SD =               .6             .8

Listening
  Level 1     M  =               .9             .8       [illegible]
              SD =               .3             .4
  Level 2     M  =              2.7            3.0          2.43*
              SD =               .7             .0
  Level 3     M  =              1.0            1.2           .67
              SD =               .9             .6
  Level 4     M  =               .7            1.4          5.19**
              SD =               .8            1.2

Reading
  Level 1     M  =          [illegible]     [illegible]      .00
              SD =               .4             .4
  Level 2     M  =              2.1            2.8          3.96**
              SD =              1.0             .4
  Level 3     M  =               .2             .8         13.56**
              SD =          [illegible]         .7
  Level 4     M  =               .6             .8           .43
              SD =               .9         [illegible]

df (1, 38)
 *p < .05
**p < .001



• To what extent is reduction of variability in the instructed group
an indication of the validity of the instrument for measuring the
effectiveness of instruction?

Table 1 shows a pattern of reduced variability in the instructed group
compared with the uninstructed group within Speaking, levels [illegible].
Although the instructed group demonstrates a higher mean performance in
level 4 of Speaking, Writing, Listening and Reading, the variability
within this group was higher than in the uninstructed group instead of
lower, as expected.

To further investigate variation in student performance, the
distribution of the students' combined total scores in the Communications
competence was examined (Figure 3).

The uninstructed group displays a positively skewed distribution. Most
students fall in the lower score range with a few in the higher score range.
The instructed group displays a negatively skewed distribution. Most of
the students fall in the higher score range with a few in the lower. The
amount of overlap between the two distributions may indicate the extent
to which the instructional treatment (as measured by the instruments)
discriminates between the two groups. Lack of overlap, or a small amount
of overlap, may indicate the discriminative power of the instrument and
its validity in measuring effectiveness of instruction. The lined area
in Figure 3 represents the amount of variation in the instructed group
which may be due to the instructional treatment. How much variation could
be attributed to the instructional treatment and how much to individual
differences within the instructed group? A t test computed for the
Communications total score for each group (instructed M = 49.5,
SD = [illegible]; uninstructed M = 37.4, SD = 9.8) resulted in a
significant value of t (38) = [illegible] (p < .01). The instructed group
performs significantly better overall.

The eta squared statistic was computed to estimate how much of the
variation in student performance can be attributed to the effects of
instruction. Twenty-six percent of the variation (eta squared = .26) in
the instructed group is attributable to the instructional effects measured
by the instrument.
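
For a two-group comparison, eta squared and the t statistic are linked by a
standard identity, eta squared = t^2 / (t^2 + df). The sketch below applies
it in both directions. The function names are hypothetical, and the t value
it prints is derived from the reported eta squared of .26 and df of 38; it
is not a value read from the report.

```python
import math


def eta_squared_from_t(t: float, df: int) -> float:
    """Proportion of variance explained by group membership, from t."""
    return t * t / (t * t + df)


def t_from_eta_squared(eta2: float, df: int) -> float:
    """Absolute t value implied by a given eta squared (0 <= eta2 < 1)."""
    return math.sqrt(eta2 * df / (1.0 - eta2))


# Back out the t value implied by eta squared = .26 with 38 degrees of
# freedom, then confirm the round trip.
implied_t = t_from_eta_squared(0.26, 38)
print(round(implied_t, 2), round(eta_squared_from_t(implied_t, 38), 2))
```

Read this way, an eta squared of .26 on 38 degrees of freedom corresponds to
a t of roughly 3.65, which is consistent with the reported significance at
the .01 level.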



The univariate ANOVA shows a significantly higher mean performance by
the instructed group at the various levels of the Communications modes.
However, this does not imply that the instructed group was [illegible]



Table 2

Chi-Square Values Comparing Mastery Students in
Instructed and Uninstructed Groups

                       Mastery Students
Mode           Instructed      Uninstructed      Chi-Square
                (n = 20)         (n = 20)

[Table body not legible in the source scan.]



Table 3

Criterion Response Frequency within Four Modes of Communications

Entries are the number of students in each group (n = 20 per group)
for whom the criterion was checked.

SPEAKING Criteria (Levels 1 and 2: criteria 1-16; Levels 3 and 4:
criteria 17-27)

Criterion    Uninstructed    Instructed    Chi-square
    1             16             18            .19
    2             13             19            .00
    3             17             19            .27
    4             14             20           6.2
    5             16             19            .91
    6             15             19           1.7
    7             16             20           2.5
    8              7             14           3.6
    9             20             16            .52
   10             16             18        [illegible]
   11             20             20        [illegible]
   12              8             17           6.8
   13              8             18           8.9
   14             20             20            --
   15             17             20           1.4
   16             14             17            .57
   17              9             16           3.8
   18              7             17           8.4
   19             15             17            .15
   20              5             13           4.9
   21              7             13           3.6
   22             11             17           7.9
   23             13             17           1.2
   24              3              9           7.9
   25              6             12           7.5
   26              3             14          10.2
   27             11             17           1.8

WRITING Criteria (28-46)

Criterion    Uninstructed    Instructed    Chi-square
   28             16             15            .00
   29             18             20            .52
   30             15             19           1.7
   31             15             17            .15
   32             10             16           2.7
   33             15             17            .15
   34             10             17           4.1
   35             13             16            .50
   36             16             18            .19
   37             11             18           4.5
   38             18             15            .69
   39             16             16            .15
   40         [illegible]        17           5.3
   41             10             16           2.7
   42              4             14           8.1
   43              9             15           2.6
   44             14             16            .13
   45              3              9           2.9
   46             15             15            .13

LISTENING Criteria (Level 1: 47; Level 2: 48-50; Level 3: 51-52;
Level 4: 53-55)

Criterion    Uninstructed    Instructed    Chi-square
   47             18             17            .00
   48             18             20            .62
   49             19             20            .00
   50             18             20            .52
   51             10              7            .40
   52             12             18           4.5
   53             16              8           1.0
   54              0              7           6.2
   55             10             14            .93

READING Criteria (Level 1: 56; Level 2: 57-59; Level 3: 60-61;
Level 4: 62-64)

Criterion    Uninstructed    Instructed    Chi-square
   56             17             17            .19
   57             14             20           4.9
   58             17             20           1.4
   59             12             17           2.0
   60              0              7           6.2
   61              4             10           2.1
   62              0              3           1.4
   63              0             15           3.6
   64             10              9           1.0

 *p < .05
**p < .001

Note: Starred criteria were identified as discriminative items.
[Star positions are not reliably aligned in the source scan.]



Speaking criterion 1 



CRITERIA 

Did not 



GROUPS 



Instructed 
n = 20 

tlninstructed 
n - 20 



16 


A 


18 


2 



The Chi-square analysis identified discriminative criteria reflecting
the effectiveness of instruction. The significance level of each Chi-square
value as indicated in Table 3 for each criterion shows a strong association
between group and criterion. Instructed students tend to perform better
on the starred criteria (see Table 3).
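
The per-criterion test can be illustrated with a 2 x 2 Chi-square on the
counts of students who did and did not meet a criterion in each group. The
sketch below applies Yates's continuity correction; that choice is an
assumption, since the report does not name the variant used, although the
corrected statistic comes close to the tabled values for Writing criteria
28 to 30 (.00, .52, and 1.7). The function name is hypothetical.

```python
def yates_chi_square(met_a, n_a, met_b, n_b):
    """Yates-corrected Chi-square (df = 1) for a 2 x 2 group-by-criterion table.

    met_a / met_b: students in groups A and B who met the criterion.
    n_a / n_b: group sizes.
    Returns None when a marginal total is zero (statistic undefined),
    for example when every student in both groups met the criterion.
    """
    a, b = met_a, n_a - met_a        # group A: met, did not meet
    c, d = met_b, n_b - met_b        # group B: met, did not meet
    n = n_a + n_b
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return None
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / denominator


# Writing criteria 28-30: (instructed met, uninstructed met), n = 20 each,
# with counts as printed in Table 3.
for criterion, (inst, uninst) in {28: (15, 16), 29: (20, 18), 30: (19, 15)}.items():
    print(criterion, yates_chi_square(inst, 20, uninst, 20))
```

With one degree of freedom the .05 critical value is 3.84, so none of these
three criteria would be flagged as discriminative under this sketch.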

The following analysis was performed to validate the conceptual
framework of the competence model:

• Can we identify clusters of response to criteria in the instructed
group that are different from the uninstructed group? Where do
clusters occur?

Cluster analysis* was employed on all 64 criteria within each mode of
the Communications competence. Twenty percent error level was chosen as
the level for comparison. This generated 10 clusters for each group.

A number of similar clusters was formed in both groups. The first
common cluster for both groups is formed within the Speaking mode.
Students who are able to make inferences tend also to make relationships
to other sources of information or among the parts of their speech. This
ability appears to describe both groups, independent of instruction.
A second common cluster indicates a relationship between Speaking and
Listening. Students who support their statements with examples in the
Speaking exercise, speak on their feet, create a positive image and
[...]
the central idea of another speaker, state examples used by the speaker and
distinguish a speaker's facts from his or her opinions. Since this cluster
is formed in a similar manner in both groups, the assumption is made that
such an ability is formed independent of instruction.

*See Donald J. Veldman, Fortran Programming for the Behavioral
Sciences (Chicago: Holt, Rinehart, & Winston, 1967). This program is
based on an article by J. H. Ward, "Hierarchical Grouping to Optimize
an Objective Function," American Statistical Association Journal, 1963,
58, 236-244. Additional subroutines were developed by Larry W. Claflien
and Fred Ostnpik, University of Wisconsin-Milwaukee.

The third common cluster indicates a similar characteristic within
the Writing mode. Students who distinguish among sources of information
tend also to internalize ideas and structure their writings appropriately.
This ability also appears to form independent of instruction.

The fourth common cluster indicates that students who are able to
recognize their own strengths and weaknesses in Writing and Speaking tend
to do so regardless of the number of self-assessment criteria involved.

The instructed group demonstrates a number of clusters which differ
from the uninstructed group. Some of the clusters formed within the
instructed group raise the following questions regarding the competence:

• Does the attainment of one mode of Communications facilitate
attainment of other modes?

• Is growth toward the mastery of the Communications competence a
function of the student's initial competence level upon entering
college?

The clusters formed within the instructed group presumably reflect
the instructional treatment, and show a definite improved association
between Speaking and Writing abilities. This suggests that skills learned
in one mode may generalize to the other. Such a cluster is not evident
in the uninstructed group, which suggests weak associations between
Speaking and Writing abilities.

Since both groups formed a cluster of Writing abilities independent
of instruction, one may conclude that students entering college with certain
Writing skills are more likely to further develop their writing ability
and to acquire Speaking skills as an instructional outcome. Students who
lack such basic Writing skills may not develop all the required Writing
and Speaking skills within a period of two years.

The patterns of student response were further investigated employing 
correlation matrices: 

• To what extent can we attribute differences between the instructed 
and uninstructed groups to the cumulative nature of the competence 
levels? 

The correlation matrices indicate clearly that the instructed group
demonstrates high intercorrelations among the criteria for levels 2, 3,
and 4 within the Speaking mode, reflecting a cumulative pattern of student
response. Students who mastered levels 3 and 4 tend to master level 2
also. This pattern is not evident in the uninstructed group. (The previous
analysis clearly indicated that instructed students master Speaking
levels 2, 3, and 4 significantly better than the uninstructed group.)

The correlation patterns support the cluster analysis results.
Similar patterns of high intercorrelations within the Writing mode for
the two groups imply a Writing capability independent of instruction.
(Within the Writing mode, the instructed group is more likely to achieve
mastery than the uninstructed group only at level 3.)
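
When criteria are scored as checked or not checked, an intercorrelation
between two criteria can be computed as a phi coefficient, which is the
product-moment correlation for two 0/1 variables. The report does not state
which coefficient was used, so the sketch below is an assumption about the
method, and the two response vectors are made-up data for eight hypothetical
students.

```python
import math


def phi_coefficient(x, y):
    """Phi coefficient between two equal-length sequences of 0/1 marks."""
    n11 = sum(1 for a, b in zip(x, y) if a and b)          # both checked
    n10 = sum(1 for a, b in zip(x, y) if a and not b)      # only x checked
    n01 = sum(1 for a, b in zip(x, y) if not a and b)      # only y checked
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)  # neither checked
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    if denominator == 0:
        return float("nan")   # undefined when a criterion has no variation
    return (n11 * n00 - n10 * n01) / denominator


# Made-up marks on two criteria for eight students.
criterion_a = [1, 1, 1, 0, 0, 1, 0, 0]
criterion_b = [1, 1, 0, 0, 0, 1, 1, 0]
print(phi_coefficient(criterion_a, criterion_b))
```

A cumulative pattern of the kind described above would show up as
consistently high positive coefficients among the criteria of adjacent
levels within a mode.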

Finally the question was raised: 

• To what extent can we attribute differences between the two groups 
to the sequential nature of the Communications competence? 

Guttman Scalogram analysis (Hambleton, 1979) was employed to explore
the theoretical framework of the pedagogical developmental sequence of the
levels within the four modes. The order of the levels within each mode as
generated by the Guttman Scalogram is presented in Table 4. The analysis



Table 4

Sequential Order of Levels Demonstrated by Instructed and
Uninstructed Groups Within Each Mode of Communications
as Generated by Guttman Scalogram Analysis

                                   Order of Levels

Group                   Speaking     Writing     Listening    Reading

Expected Sequence       1 2 3 4      1 2 3 4     1 2 3 4      1 2 3 4

Instructed Group        1 2 3 4      1 2 3 4     1 2 4 3      2 1 3 4
  COR =                  1.000        .950        1.000        .92
  MMR =                   .650        .612         .88         .86
  COS =                  1.000        .871        1.000        .45

Uninstructed Group      1 3 2 4      1 2 4 3     1 2 4 3      1 2 4 3
  COR =                   .975       1.000         .950       1.000
  MMR =                   .885        .787         .937        .85
  COS =                   .778       1.000         .200       1.000

COR = Coefficient of reproducibility.
MMR = Minimum marginal reproducibility.
COS = Coefficient of scalability.



indicates that the instructed group demonstrates a sequence identical to
that of the expected order in the Writing and Speaking modes. The Speaking
levels show perfect scalability and the Writing levels show high scalability.
The uninstructed group do not follow the expected developmental sequence
in Speaking and Writing. They do demonstrate perfect scalability in an
order clearly their own.

The results may indicate that Alverno is successful in teaching its
students to follow a pedagogically specified developmental sequence
throughout their learning experience and that uninstructed students adopt
a different sequence of performance throughout their previous experience.

In the Reading exercise, the instructed group demonstrates level 2
before they do level 1, and the overall performance is not entirely
coherent within the group (COS = .45). The uninstructed group demonstrates
level 4 before they do level 3, but the overall performance is coherent
within the group (COS = 1.000).

Both groups demonstrate level 4 of the Listening exercise before they
do level 3. However, the instructed group demonstrates perfect scalability
(COS = 1.000); the coefficient of scalability is low (.20) in the uninstructed
group.
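
The three scalogram coefficients are related by a standard formula: the
coefficient of scalability is the improvement of the coefficient of
reproducibility over the minimum marginal reproducibility, divided by the
largest improvement possible. The sketch below uses hypothetical function
names and checks the formula against the instructed group's Writing values
as printed in Table 4 (COR = .950, MMR = .612, COS = .871).

```python
def minimum_marginal_reproducibility(pass_proportions):
    """Mean over items of the larger of (proportion passing, proportion failing)."""
    return sum(max(p, 1.0 - p) for p in pass_proportions) / len(pass_proportions)


def coefficient_of_scalability(cor: float, mmr: float) -> float:
    """(COR - MMR) / (1 - MMR); undefined when MMR equals 1."""
    if mmr >= 1.0:
        return float("nan")
    return (cor - mmr) / (1.0 - mmr)


# Instructed group, Writing mode (values as printed in Table 4).
print(round(coefficient_of_scalability(0.950, 0.612), 3))
```

Because the denominator shrinks as the marginals become extreme, a high
reproducibility coefficient can still yield a low scalability coefficient,
as in the uninstructed group's Listening results.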

Discussion

The Communications study results were first presented to the chair-
person of the Communications Division. Another meeting was arranged
during which all Communications Division members discussed the study and
contributed to the interpretation of the data and its implication for
instrument revision.

Each research question (listed below) is discussed in light of the
statistical findings and the discussion with Communications Division
members.

The first question is:

• How does the performance of the instructed group compare to the
performance of the uninstructed group on each competence level
within each Communications mode (Speaking, Writing, Listening,
and Reading)?

The first competence level in each Communications mode consists of one
self-assessment criterion which was intentionally designed to be an easy
one for the incoming student. Consequently both the instructed and
uninstructed students performed equally well on level 1 criteria across
four modes of Communications. Within the remaining levels 2 through 4 (a
total of 12 across the four modes) the instructed group performed
significantly higher on 9 out of the 12 competence levels.

The instructed group does not demonstrate higher performance in
level 4 Writing, level 3 Listening, and level 4 Reading. The questions
raised by faculty that will guide instrument criteria revision on these
three levels are:

1. To what extent are the criteria clearly defined? 

2. Are the criteria a sensitive measure of what was learned? 

3. Does the stimulus elicit the expected response? 

4. Is the assessor's interpretation of student performance 
consistent with the intended meaning of the criterion? 

The second question is:

• To what extent is reduction of variability in the instructed group 
an indication of the validity of the instrument for measuring the 
effectiveness of instruction? 

We believe that reduction in variation of student performance is one
indication of instrument validity in our outcome-centered curriculum,
where cutoff points are not always specified. However, the data analysis
indicates that absolute statements on reduction in variability cannot be
made for the entire instrument. Performance variability changed with the
competence levels. The instructed group demonstrated increased variability
at level 4 of all Communications modes, whereas at the lower levels there
is some indication of a decreased pattern of variability within the
instructed group compared to the uninstructed group. These findings suggest
that individual differences play a greater role in the variation of perfor-
mance at the higher level of Communications. The difference in variability
at level 4 as compared to the preceding levels may also indicate the leap
students are making in integrating competence and content. At the lower
levels students follow explicit criteria which structure their learning.
Level 4 criteria, however, call for internalization of the competence with
more emphasis on the content analysis. It is assumed that such an internal-
ization process requires additional skills, bringing individual differences
to the fore. Figure 3 indicates that 45% of the instructed students are
below the cutoff point (the point where the instructed and uninstructed
curves intersect). Thus, instruction was effective for 55% of the instructed
students. When performance of individual students is followed across the
four modes of Communications, it is apparent that the same group of students
excel at the higher levels, producing a more dispersed score distribution.
When we attempt to measure change, individual differences account not only
for attaining a competence level but also for the rate of attainment. The
distribution of scores may indicate that either instruction was not as
effective for 45% of the students, or these students need more time to
develop the required Communications skills. A final statement about the
validity of the instrument with regard to reduction in instructed students'
variability should integrate data pertaining to the rate of competence
attainment, i.e., number of attempts to master a competence level, duration
of learning experience prior to the mastery attempt, and consistency of
performance across four modes of Communications.

The third question is:

• To what extent did the instructed group master each competence level
within each Communications mode compared to the uninstructed group?

Comparison of mastery performance between the two groups indicated a
significantly higher performance of the instructed group at levels 2, 3,
and 4 of the Speaking mode, level 4 of the Listening mode, and level 3 of
the Writing mode. The instructed group does not indicate high mastery
performance on the Reading and Listening modes, whereas in the Writing mode
the lack of significant differences in mastery performance is due to the
fact that the uninstructed group performed as well as the instructed group
(see Table 2).

The results of the present study clearly distinguish between the
Speaking and Writing components as one aspect of the Communications
competence, and Reading and Listening as a somewhat different aspect.
Alverno faculty were aware of such differences and identified the Reading
exercise as "Analysis of Written Verbal Construct." The Listening exercise
is referred to as "Analysis of Oral Verbal Construct." In these modes of
Communications students are required to demonstrate a greater degree of
analytical abilities as well as Communications skills. Instructed students
also seem to view Listening and Reading as preparatory exercises for
Speaking and Writing. This may diminish the seriousness with which they
approach the task.

The statistical analysis directs faculty attention to two competences
(rather than one) measured by the instrument: Analysis and Communications.
The Communications competence is defined as the process of effective
clarification and involvement between a presenter and an audience (which
represents the Communications aspect of the instrument). Since the instrument
is also measuring analytical skills, how might we best describe the Communica-
tions aspect of Reading and Listening? Is it the way the student articulates
her analytical skills? Is the analytical part inseparable from her ability
to effectively clarify and articulate her message? Are Communications and
Analysis two competences or one? If analytical skills are necessary for
effective communications, the instrument is considered unidimensional even
though it measures an additional analytical dimension.
The fourth question is:

• Is the instructed group performing better on each of the
criteria?

Sixteen criteria out of 64 were found to discriminate between the
instructed and uninstructed students. However, the question is whether the
statistical analysis employed (Chi-square) is a suitable strategy for iden-
tifying such criteria. Should criteria be considered in terms of amount of
effort and time invested in the teaching process rather than being evaluated
by the frequency of student response (see Table 3)? Faculty decided to study
the aspects that contribute to the discriminant power of a criterion.

The fifth question is:

• Can we identify clusters of responses in the instructed group
which are different from the uninstructed group? Where do they
occur?

Cluster analysis demonstrates common abilities within either the Speaking
or Writing modes independent of instruction, suggesting that the uninstructed
group enters college with already acquired communications skills in Speaking
and Writing. However, instruction produces significantly greater performance
by instructed students in the Speaking mode but not in the Writing mode. The
cluster analysis also demonstrates a learned pattern or improved associations






■ < .J,: r\-:,-r:! ■ ,■ .<-.;,■:-.. : , ... ,:■ . , 






did 



C'i • i: i i MS- 



'■)''" y ."• • of ;. v .'.:!irr « : i j ; [ ; ... , 

■ •'■ ' y ? . e ; j nd v.j i j da :. 
■ <-■.-: .-. s f !.j'..U':if. po r i'o ?ar:aji<. e 
•• '• " diai-:n".s i a -a s. :i :. por:orni.'iM-, 
■'■ ■■■ ' an i i rvijh.u-k r.o s t udeiit.s . 



- i i"i ; :•. v :■■ :i ; 



ass ! ..: si s ; 



■ a tiiO. v i ' 



1 : 1 ;"■:"■'''•■■>•; in: ! r:iraeu: tli* v\> J . •prac-ii t and revision t.ivil 
* l : i ' ' ! ' : ' • ' •• • o : e\ r a 1 ua H ar, -e.>d vn I i dnt I or, . ran: i r v 
r *.-t.-tj k aaris.. : ■ i :a. r a?uert : dvsisai a:; ruiea as r ia.-v 
" :;! ;[:! i:5 ' : : : : ' • £ :"i::s.-:i? iwisi^:;. Mr,n, ;::reo:a.a:a . their in; ea i - 
■ : " 1 ' ; ^ ? ; ' ■•'ad" • • es and inl..*:; 

' v,t 1 : ,;at '■'■'.id ; reeer^-u in if.,.-, :sun-r s t i :nu 1 a ted facuUv 
:rr/:i!vf»:,. nr. In r.--iir i j : /:i;a:u; r assessru-ul frchniqucs. The 
" Vj-rv i'.'.insin!; r!n: j t«-d ;r,. r . v^. ■ m tasMa: uu. i on of t he empirical data 

»p;>as:va t-- 1 V -a- ::it r li.-ij r rv. ■ t ; va t a" r.wp i. i n> : with old issues ar 

■^ser -i! ; us ses -s-.-- , :or a. mi f . r:-. : la: r.-roncop t i.:a i I :•: i nr. a compel eno- 
o.-.' ! iai : ?.a , :ar is. i s : ■as.a s s . = r /asa--ine ' he ■. asTipet.ene v assessment, eriraaia 
: '" ,r >aise. r:iae re t |. i n ;. : ; n> . t ir>.t;rnet iot*,aI oh je. t i vvs . 

The comparative illustrations demonstrate the extent to which a
variety of [illegible] by the faculty identify new issues, all
directed toward establishing a link between performance measured by the
instrument under study and the effectiveness of instruction. Once such a
link is demonstrated across all generic instruments, the internal validity
of the curricula is strengthened.






Ill ■ = 



13 !"'. 



/ 



so u-1 ;. ! ra;;u- 



"u-v/i-rk for t'•)^■■ va ] Ldat ion of ,!sst ,i ;.v 
out. >-<v:k-~.-./nt. iTt-d i:urr i cul inn demanded that i [ hi- 
iEi<n; iff Liu t;»">rrip l: t crises and ! he 'ha rat* t. e r i s L i as ot 
•! !-■ 1 iij'.'wlt v auest Lens and rone*.' riv- . Tin' 

* he i n t ej. 1 ro t i orv^v- ; a i ! t he 
•a::!pe i enrer. del i ned as i-'orie r i < , dove 1 opmen t 



' i di-j i .a- . ra::i.''.v- »r 



mini i. a s Si/ c, in *n . f o * 



i ns t ru;' t, a: . 



as; :t 



'" ;t ''" t: 'in ie r«i ••>::■!.»:"!• t i.t: extent 'a 1 wnica. mat rue ted studonrs demount rale 
even no i \ y ae:s«ss a vara-! v n« sot tics and ..loinpt'tonca' i:ofnponi'nts . U r lu : 
erne : ai.Stv :ns-,o ..pi*;! a ro;:>pot enee 1 1\ C. * ■ ; I ^ sup.irat'.: 'onnipunr-n! s , flu-y 
,: ' i aa a .-a. a'.-. >ca - - i i y dev.- \ npnrnt a 1 sonpaen. e of acquiring these 

' aa* !:ii.-v em. lieve vi!i » i ! i lait- the learning of i he abiltt'v 

sequence, while it makes sense pedagogically, may not match the develop-
ment of the ability as it is acquired without the benefit of formal
instruction. Students who have [illegible] life experience may develop some
abilities on their own. Results from this study stimulated
questions about the extent to which each of the four competence levels are
truly cumulative and sequential. The holistic nature of the competences is
explored in this study by examining the extent to which the ability measured
is unidimensional, especially at its upper levels.

The assessment system and techniques employed at Alverno are designed
to assess mastery learning for credentialing purposes. The present study
explored mastery learning without specific reference to cutoff points.
Instead, we focused on the reduction in variability within groups since a
major problem in measuring change and growth in a competence-based program






■■vi ( i f t ;■)<.• vmi' t <n 



a- r , I :j 7 ) - kedne t i en i a v. iri.it i on 



•tormunr 



lie a ran vt • sea;; r . 1 1 1* i ne l 



s t I v t ■ i. nd i v i iiu 1 1 d i I t • 



> »nraet: ear 



1 nd i.v i dun ! d 



jL in 



;iC '. ' • .in ' . . ii ii '*:C: -1' !"■' I e U r ! i I H i* 



i r i ■. i I i * v \ ■ : .■ ■ t ecle n t po r f o man i t.- . 
; v.- ir i a l i • Mi i a s f uden t per f o rra ■ d 
Fast r li r f. i (mm 1 me t hods can t hen foe us < > 
■ t hesv i nd i v i dun 1 d i f f erenc i n i o 
a ran: who in- i nd i v t dua 1 differ enoes are 



i nee r no ra t e 
t. i-a: roduee 
upward "iav\ 



i I v stai infills . Identrfyinri the areas whore ins true- 
in student pa r f o rmance mav far i 1 i tnte students' 



a*, w i rain r \ w l en ra s a a p r.aass . 
We would like to conduct additional studies (pre- and post-instruction)
on the same students, which will assist us in further exploring the issue of
individual differences. Such pre- and post-instruction studies will allow
us to explore the following questions which are central to demonstrating
program effectiveness: Does the program provide similar opportunity for
[illegible] student? Is the program more effective with the better student?
Does it stimulate [illegible] growth for an excelling student?

In sum, the empirical illustrations demonstrated the importance of
recognizing the need for construct validation in outcome-centered programs
in higher education. Studies compared competence definitions against
empirical data and provided a broader scope for understanding the competence
construct and generated implications for future instruction. A link was
created between validation studies and instrument revision with the full
cooperation of the faculty. Finally, the studies support recognition of
the importance of individual differences and their impact on competence






( ' 1 »;:t| >«• i •■ ■ n »: e rn 'j.'!":,] ne ;ui 1 u: a. e d 



students' [illegible] provides greater insight into the meaning of competence
assessment [illegible]. Future studies will incorporate the present
findings and will open new ways and strategies for better understanding
outcome-centered programs, and how the validity of their assessment techniques
can be established.

■ >?tr «iwarei:t.- ■-. the r ra:n-wo rk ol r he "wrriculun from which in.str'»iHs 

>s <»ii(.i Le rel.au- t.u issues which lie at the heart, of the 
'•nrrieulum. hV suppnri attempts to establish the construct validity of the 
ccmpe i ■ eve.-, a-; a ma jer way ;.o es t ah 1 i sh instrument validity: 

a significant development is the recognition of the need
for construct validation studies. . . . The size of the
test development project will influence the scope and the
number of construct validation studies, but clearly, more
work is needed in this area. . . . The limit of those
studies will be the level of creativity and ingenuity of
the researchers involved (Hambleton, 1978)

We agree with Hambleton that we are limited only by our own level of
creativity and ingenuity.






References




Berk, R. A. Determination of optimal cutting scores in criterion-referenced
measurement. Journal of Experimental Education, 1976, 45, 4-9.

Cronbach, L. J. Test validation. In R. L. Thorndike (Ed.), Educational
measurement (2nd ed.). Washington: American Council on Education,
1971.



Fincher, C. Program monitoring in higher education. Monitoring Ongoing
Programs (3rd ed.), 1978.

Gamson, Z. Assuring survival by transforming a troubled program: Grand
Valley State Colleges. In G. Grant & Associates (Eds.), On competence.
Washington: Jossey-Bass, 1979.

Hambleton, R. K., & Eignor, D. R. Criterion-referenced test development
and validation methods. AERA training program materials. University
of Massachusetts, Amherst, 1979.

Hambleton, R. K., & Swaminathan, H. Criterion-referenced testing and
measurement: A review of technical issues and developments. Review of
Educational Research, 1978, 48, 1-47.

Hays, W. L. Statistics for the Social Sciences (2nd ed.). New York:
Holt, Rinehart and Winston, 1973.

King, E. S. Assessment of competence: Technical problems and publications.
In G. Grant & Associates (Eds.), On competence. Washington: Jossey-Bass,
1979.

Kohlberg, L., Colby, A., Gibbs, J., & Speicher-Dubin, B. Standard Form
Scoring Manual. Unpublished manuscript, Harvard University, 1978.

Mentkowski, M. Creating a "mindset" for evaluating a liberal arts curriculum
where "valuing" is a major outcome. In L. Kuhmerker, M. Mentkowski,
and E. Erickson (Eds.), Evaluating moral development: And evaluating
educational programs that have a value dimension. Schenectady, N.Y.:
Character Research Press, 1980.

Mentkowski, M., & Doherty, A. Careering after college: Establishing the
validity of abilities learned in college for later success. Proposal
funded by the National Institute of Education, September 1977.

Mentkowski, M., & Doherty, A. Careering after college: Establishing the
validity of abilities learned in college for later careering and
professional performance. Final report to the National Institute
of Education. Milwaukee, WI: Alverno Productions, 1983.



Messick, S. The standard problem: Meaning and values in measurement and
evaluation. American Psychologist, 1974, 30, 955-966.

Payne, J. Principles of social science measurement. College Station,
Tex.: Lytton Publishing Company, 1975.








Alverno College 



3401 South 39th Street / Milwaukee, WI 53215






